38 research outputs found

    Escaping Saddle Points with Adaptive Gradient Methods

    Full text link
    Adaptive methods such as Adam and RMSProp are widely used in deep learning but are not well understood. In this paper, we seek a crisp, clean and precise characterization of their behavior in nonconvex settings. To this end, we first provide a novel view of adaptive methods as preconditioned SGD, where the preconditioner is estimated in an online manner. By studying the preconditioner on its own, we elucidate its purpose: it rescales the stochastic gradient noise to be isotropic near stationary points, which helps escape saddle points. Furthermore, we show that adaptive methods can efficiently estimate the aforementioned preconditioner. By gluing together these two components, we provide the first (to our knowledge) second-order convergence result for any adaptive method. The key insight from our analysis is that, compared to SGD, adaptive methods escape saddle points faster, and can converge faster overall to second-order stationary points.Comment: Update Theorem 4.1 and proof to use martingale concentration bounds, i.e. matrix Freedma

    Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks

    Full text link
    Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated method for connecting scientific literature to synthesis insights. Starting from natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for arbitrary materials. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties, and that the model's behavior complements existing thermodynamic knowledge. Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds.Comment: Added new funding support to the acknowledgments section in this versio

    Pan-cancer analysis of whole genomes

    Get PDF
    Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale(1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter(4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation(5,6); analyses timings and patterns of tumour evolution(7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity(8,9); and evaluates a range of more-specialized features of cancer genomes(8,10-18).Peer reviewe

    Learning and optimization in the face of data perturbations

    No full text
    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020Cataloged from the official PDF of thesis.Includes bibliographical references (pages 145-163).Many problems in the machine learning pipeline boil down to maximizing the expectation of a function over a distribution. This is the classic problem of stochastic optimization. There are two key challenges in solving such stochastic optimization problems: 1) the function is often non-convex, making optimization difficult; 2) the distribution is not known exactly, but may be perturbed adversarially or is otherwise obscured. Each issue is individually so challenging to warrant a substantial accompanying body of work addressing it, but addressing them simultaneously remains difficult. This thesis addresses problems at the intersection of non-convexity and data perturbations. We study the intersection of the two issues along two dual lines of inquiry: first, we build perturbation-aware algorithms with guarantees for non-convex problems; second, we seek to understand how data perturbations can be leveraged to enhance non-convex optimization algorithms. Along the way, we will study new types of data perturbations and seek to understand their connection to generalization.by Matthew James Staib.Ph. D.Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Scienc
    corecore